We examine partisan differences in the behavior, communication patterns and social interactions of
more than 18,000 politically-active Twitter users to produce evidence that points to changing levels of
partisan engagement with the American online political landscape. Analysis of a network defined by 
the communication activity of these users in proximity to the 2010 midterm congressional elections 
reveals a highly segregated, well clustered partisan community structure. Using cluster membership 
as a high-fidelity (87% accuracy) proxy for political affiliation, we characterize a wide range of differences
in the behavior, communication and social connectivity of left- and right-leaning Twitter
users. We find that in contrast to the online political dynamics of the 2008 campaign, right-leaning 
Twitter users exhibit greater levels of political activity, a more tightly interconnected social structure, 
and a communication network topology that facilitates the rapid and broad dissemination of political information.

I. INTRODUCTION



Digitally-mediated communication has become an integral part of the American political landscape, providing citizens access 
to an unprecedented wealth of information and organizational resources for political activity. So pervasive is the influence 
of digital communication on the political process that almost one quarter (24%) of American adults got the majority of their 
news about the 2010 midterm congressional elections from online sources, a figure that has increased three-fold since the Pew
Research Center began monitoring the statistic during the 2002 campaign (Pew Internet and American Life Project 2010a).
Relax the constraint that a majority of a person's political news and information must come from online sources and the figure
jumps to include the 54% of adult Americans who went online in 2010 to get political information. Critically, this activity
precipitates tangible changes in the beliefs and behaviors of voters, with 35% of internet users who voted in 2010 reporting that
political information they saw or read online made them decide to vote for or against a particular candidate (Pew Internet and
American Life Project 2010a).



Within this ecosystem of digital information resources, social media platforms play an especially important role in facilitating
the spread of information by connecting and giving voice to the voting public (Aday et al. 2010; Bennett 2003; Farrell and
Drezner 2008). Networked and unmoderated, social media are characterized by the large-scale creation and exchange of
user-generated content (Kaplan and Haenlein 2010), a production and consumption model that stands in stark contrast to the
centralized editorial and distribution processes typical of traditional media outlets (Benkler 2006; Sunstein 2007).

In terms of political organization and engagement, the benefits of social media use are many. For voters, social media make 
it easier to share political information, draw attention to ideological issues, and facilitate the formation of advocacy groups with
low barriers to entry and participation (Garrett 2006; Tolbert and McNeal 2003). The ease with which individual voters can
connect with one another directly also makes it easier to aggregate small-scale acts, as in the case of online petitions, fundraising,
or web-based phonebanking (Land 2009). Together, these features contribute to the widespread use of social media for political
purposes among the voting public, with as many as 21% of online adults using social networking sites to engage with the 2010
congressional midterm elections (Pew Internet and American Life Project 2010c). Moreover, a survey by the Pew Internet and
American Life Project finds that online political activity is correlated with more traditional forms of political participation, with
individuals who use blogs or social networking sites as a vehicle for civic engagement being more likely to join a political or
civic group, compared to other internet users (Pew Internet and American Life Project 2010b).

Likewise, candidates and traditional political organizations benefit from a constituency that is actively engaged with social 
media, finding it easier to raise money, organize volunteers and communicate directly with voters who use social media platforms
(Lutz 2009). Social media also facilitate the rapid dissemination of political frames, making it easy for key talking points
to be communicated directly to a large number of constituents, rather than having to subject messages to the traditional media 
filter. 

Considered in this light, it becomes clear why social media were argued to have played such an important role in the political
success of the Democratic party in the 2008 presidential and congressional elections (Carr 2008; Creamer 2008; Holahan
2008). Survey data from the Pew Research Center showed that, along the seven dimensions used to measure online political
activity, Obama voters were substantially more likely to use the internet as an outlet for political activity (Pew Internet and
American Life Project 2008). In particular, Obama voters were more likely than McCain voters to create and share political
content, and to engage politically on an online social network (Pew Internet and American Life Project 2008). Moreover,
a 2009 Edelman report found that in addition to a thirteen million member e-mail list, the Obama campaign enjoyed twice as
much web traffic, had four times as many YouTube viewers and five times more Facebook friends compared to the McCain
campaign (Lutz 2009). While the direct effect of any one media strategy on the success of a campaign is difficult to assess and
quantify, the data show that the Obama campaign had a clear advantage in terms of online voter engagement.

Motivated by the connection between the widely reported advantage in online mobilization and the result of the 2008 presidential
election, we seek to understand structural shifts in the American political landscape with respect to partisan asymmetries
in online political engagement. We work toward this goal by examining partisan differences in the behavior, communication
patterns and social interactions of more than 18,000 politically-active users of the popular social media platform Twitter. Among
all social media services, Twitter makes an appealing analytical target for a number of reasons: the public nature of its content,
the accessibility of the data through APIs, a strong focus on news and information sharing, and its prominence as a platform for
political discourse in America and abroad (Howard et al. 2011; Kwak et al. 2010). These features make a compelling case for
using this platform to study partisan political activity.

For this analysis we build on the findings of a previous study which established the macroscopic structure of domestic political
communication on Twitter, a social networking platform that allows individuals to create and share brief 140-character
messages. In that work we employed clustering techniques and qualitative content analysis to demonstrate that the network of
political retweets exhibits a highly segregated, partisan structure (Conover et al. 2011a). Despite this segregation, we found that
politically left- and right-leaning individuals engage in interaction across the partisan divide using mentions, a behavior strongly 
correlated with a type of cross-ideological provocation we term 'content injection.' 
Having established the large-scale structure of these communication networks, in this study we employ a variety of methods
to provide a more detailed picture of domestic political communication on Twitter. We characterize a wide range of differences 
in the behavior, communication, geography and social connectivity of thousands of politically left- and right-leaning users. 
Specifically, we demonstrate that right-leaning Twitter users exhibit greater levels of political activity, tighter social bonds, and a 
communication network topology that facilitates the rapid and broad dissemination of political information, a finding that stands 
in stark contrast to the online political dynamics of the 2008 campaign. 

With respect to individual-level behaviors, we find that right-leaning Twitter users produce more than 50% more total political 
content and devote a greater proportion of their time to political discourse. Right-leaning users are also more likely to use 
hyperlinks to share and refer to external content, and are almost twice as likely as left-leaning users to self-identify their
political alignment in their profile biographies. At the individual level, these behavioral factors paint a picture of a right-leaning 
constituency comprised of highly-active, politically-engaged social media users, a trend we see reflected in the communication 
and social networks in which these individuals participate. 

Regarding connectivity patterns among users in these two communities we report findings related to three different networks, 
described by the set of explicitly declared follower/followee relationships, mentions, and retweets. Casting the declared follower 
network as the social substrate over which political information is most likely to spread, we find that right-leaning users exhibit a 
greater propensity for mutually-affirmed social ties, and that right-leaning users tend to form connections with a greater number 
of individuals in total compared to those on the left. With respect to the way in which information actually propagates over this 
substrate in the form of retweets, right-leaning users enjoy a network structure that is more likely to facilitate the rapid and broad 
dissemination of political information. Additionally, right-leaning users exhibit a higher probability to rebroadcast content from 
and to be rebroadcast by a large number of users, and are more likely to be members of high-order retweet network k-cores and
k-cliques, structural features that are associated with the efficient spreading of information and adoption of political behavior 
and opinions. Pointing definitively to a vocal, socially engaged, densely interconnected constituency of right-leaning users, 
these topological and behavioral features provide a significantly more nuanced perspective on political communication on this 
important social media platform. Moreover, through its use of digital trace data to illuminate a complex sociological phenomenon,
this article illustrates the explanatory power of data science techniques and underscores the potential of this burgeoning scientific 
epistemology. 



II. PLATFORM & DATA 
A. The Twitter Platform 

Twitter is a popular social networking and microblogging site where users can post 140-character messages containing text 
and hyperlinks, called tweets, and interact with one another in a variety of ways. In the present section we describe four of the 
platform's key features: follow relationships, retweets, mentions, and hashtags. 

Twitter allows each user to broadcast tweets to an audience of users who have elected to subscribe to the stream of content 
he or she produces. The act of subscribing to a user's tweets is known as following, and represents a directed, non-reciprocal 
social link between two users. From a content consumption perspective, each user can sample tweets from a variety of content 
streams, including the stream of tweets produced by the users he or she follows, as well as the set of tweets containing specific 
keywords known as hashtags. 

Hashtags are tokens prepended with a pound sign (i.e. #token) which, when displayed, function as a hyperlink to the stream
of recent tweets containing the specified tag (Java et al. 2007). While they can be used to specify the topic of a tweet (e.g.
#oil or #taxes), when used in political communication hashtags are commonly employed to identify one or more intended
audiences, as in the case of the most popular political hashtags, #tcot and #p2, acronyms for "Top Conservatives on Twitter"
and "Progressives 2.0," respectively. In this way, hashtags function to broaden the audience of a tweet, extending its visibility
beyond a person's immediate followers to include all users who seek out content associated with the tag's topic or audience. For
this reason, as outlined in Section III.A, we restrict our analysis to the set of tweets containing political hashtags, ensuring that
the content under study is broadly public and expressly political in nature.

In addition to broadcasting tweets to the public at large, Twitter users can interact directly with one another in two primary
ways: retweets and mentions. Retweets often act as a form of endorsement, allowing individuals to rebroadcast content generated
by other users, thus raising the content's visibility (boyd et al. 2008). Mentions allow someone to address a specific user directly
through the public feed, or, to a lesser extent, refer to an individual in the third person. In this study, we differentiate between
mentions that occur in the body of the tweet and those that occur at the beginning of a tweet, as they correspond to distinct modes
of interaction. Mentions located at the beginning of a tweet are known as 'replies', and typically represent actual engagement,
while mentions in the body of a tweet typically constitute a third-person reference (Honeycutt and Herring 2008). Together,
retweets and mentions act as the primary mechanisms for explicit, public user-user interaction on Twitter.
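As an illustration of the interaction types described above, the following minimal sketch labels a tweet as a retweet, reply, in-body mention, or plain broadcast. The `RT @user` prefix convention, the regular expressions, and the name `classify_tweet` are assumptions made for this example; they are not the parsing rules used to build the networks in this study.

```python
import re

# Assumed conventions: a retweet starts with "RT @user", a reply starts with
# "@user", and any other "@user" occurrence is an in-body mention.
RETWEET = re.compile(r"^RT @(\w+)", re.IGNORECASE)
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")


def classify_tweet(text):
    """Return (kind, target_users, hashtags) for one tweet's text."""
    text = text.strip()
    hashtags = [h.lower() for h in HASHTAG.findall(text)]
    retweet = RETWEET.match(text)
    if retweet:
        return "retweet", [retweet.group(1).lower()], hashtags
    users = [u.lower() for u in MENTION.findall(text)]
    if not users:
        return "broadcast", [], hashtags
    # A leading mention is treated as a reply; anything else as a mention.
    kind = "reply" if text.startswith("@") else "mention"
    return kind, users, hashtags
```

Separating replies from in-body mentions at parse time keeps the two modes of interaction distinguishable in any network built from the output.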
TABLE I Political hashtags related to #p2 and #tcot (acronyms for 'Progressives 2.0' and 'Top Conservatives on Twitter'). Tweets
containing any of these were included in our sample.

Just #p2: #casen #dadt #dc10210 #democrats #dul #fem2 #gotv #kysen #lgf #ofa #onenation #p2b
#pledge #rebelleft #truthout #vote #vote2010 #whyimvotingdemocrat #youcut

Both: #cspj #dem #dems #desen #gop #hcr #nvsen #obama #ocra #p2 #p21 #phnm #politics #sgp
#tcot #teaparty #tlot #topprog #tpp #twisters #votedein

Just #tcot: #912 #ampat #ftrs #glennbeck #hhrs #iamthemob #ma04 #mapoli #palin #palin12 #spwbt
#tsot #tweetcongress #ucot #wethepeople



B. Data 

The analysis described in this article relies on data collected from the Twitter 'gardenhose' streaming API between September
1st and January 7th, 2011, the eighteen week period surrounding the November 4th United States congressional midterm
elections. The gardenhose provides a sample of approximately 10% of the entire Twitter corpus in a machine-readable format.
Each tweet entry is composed of several fields, including a unique identifier, the content of the tweet (including hashtags and
hyperlinks), the time it was produced, the username of the account that produced the tweet, and in the case of retweets or mentions,
the account names of the other users associated with the tweet. 

From this eighteen week period we collected data on 6,747 right-leaning users and 10,741 left-leaning users, responsible for 
producing a total of 1,390,528 and 2,420,370 tweets, respectively. It's useful to note that we evaluate all gardenhose tweets 
associated with each user, rather than just those containing political hashtags, in order to facilitate comparisons between the two 
groups in terms of relative proportions of attention allocated to political communication. 



III. METHODOLOGY 

In order to examine differences in the behavior and connectivity of left- and right-leaning Twitter users we rely on the political 
hashtags and partisan cluster membership labels established in a previous study on political polarization. In addition to reviewing 
the approach used to establish these features, we show that the networks and communities under study are representative of 
domestic political communication on Twitter in general. 



A. Identifying Political Content 



As outlined in Section II.A, hashtags are used to specify the topic or intended audience of a tweet, and allow a user to engage
a much larger potential audience than just his or her immediate followers. We define the set of pertinent political communication
as any tweet containing at least one political hashtag. While an individual can engage in political communication without
including a hashtag, the potential audience for such content is limited primarily to his or her immediate followers. Moreover,
restricting our analysis to tweets which have been expressly identified as political in nature allows us to define a high-fidelity
corpus, avoiding the risk of introducing undue noise through the use of topic detection strategies (Blei et al. 2003; Landauer
et al. 1998).

To isolate a representative set of political hashtags and to avoid introducing bias into the dataset we performed a simple 
algorithmic hashtag discovery procedure. We began by seeding our sample with the two most popular political hashtags, #p2 
("Progressives 2.0") and #tcot ("Top Conservatives on Twitter"). For each seed we identified the set of hashtags with which 
it co-occurred in at least one tweet, and ranked the results using the Jaccard coefficient. For a set of tweets S containing a seed
hashtag, and a set of tweets T containing a second hashtag, the Jaccard coefficient between S and T is

$$J(S, T) = \frac{|S \cap T|}{|S \cup T|}.$$

Thus, when the tweets in which both seed and the second hashtag occur make up a large portion of the tweets in which either
occurs, the two are deemed to be related. Using a similarity threshold of 0.005 we identified sixty-six unique hashtags (Table I),
eleven of which were excluded due to overly-broad or ambiguous meanings (Table II). While it is a common practice among
spammers to contribute content to popular hashtag streams, we do not believe this phenomenon plays a substantial role in
shaping the structure of the sample data. During a previous study we found that of 1,000 manually-inspected accounts identified
by this methodology fewer than 3% corresponded to foreign language or spam activity (Conover et al. 2011a).

TABLE II Hashtags excluded from the analysis due to ambiguous or overly broad meaning.

Excl. from #p2: #economy #gay #glbt #us #wc #lgbt
Excl. from both: #israel #rs
Excl. from #tcot: #news #qsn #politicalhumor

FIG. 1 Hashtag popularity decay in terms of total number of tweets and users associated with each tag. On the horizontal axis tags have been
ordered according to one of the two popularity measures: number of tweets (bottom) and users (top). The roughly exponential decay indicates
that the inclusion of additional hashtags is unlikely to result in a substantial increase in the size of the corpus.
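The hashtag discovery procedure of this section can be sketched as follows. This is a minimal illustration under stated assumptions: the input layout (a mapping from tweet id to the set of hashtags in that tweet), the function names, and the use of a greater-or-equal comparison against the threshold are choices made for the example rather than details reported here.

```python
from collections import defaultdict


def jaccard(s, t):
    """Jaccard coefficient |S ∩ T| / |S ∪ T| of two sets."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def related_hashtags(tweets, seed, threshold=0.005):
    """Rank hashtags co-occurring with `seed` by Jaccard similarity.

    `tweets` maps a tweet id to the set of hashtags it contains
    (an assumed input layout for this sketch).
    """
    tweets_by_tag = defaultdict(set)
    for tweet_id, tags in tweets.items():
        for tag in tags:
            tweets_by_tag[tag].add(tweet_id)
    seed_tweets = tweets_by_tag.get(seed, set())
    scores = {}
    for tag, tag_tweets in tweets_by_tag.items():
        if tag == seed or not (seed_tweets & tag_tweets):
            continue  # keep only tags that co-occur with the seed at least once
        score = jaccard(seed_tweets, tag_tweets)
        if score >= threshold:  # assumed: threshold is inclusive
            scores[tag] = score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Running the function once per seed (#p2 and #tcot) and merging the two ranked lists would yield candidate sets of the kind shown in Table I, before any manual exclusion of ambiguous tags.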



B. Representativeness 



Using the technique outlined above we identified many high-profile political hashtags, and with them the majority of tweets 
and users associated with domestic political communication on Twitter. Supporting this claim, Figure 1 shows a roughly exponential
decay in hashtag popularity as measured in terms of number of users or tweets associated with the hashtag. This sharp
decay in the tag popularity indicates that the inclusion of additional political hashtags is not likely to substantially increase the 
size or alter the structure of the corpus. 

FIG. 2 Size of the set of unique users and tweets resulting from the inclusion of additional hashtags. Axes are ordered according to the total
number of tweets (top) and users (bottom) associated with each tag.

This claim is also supported by Figure 2, which shows that there is a strong effect of diminishing returns with respect to the
observed number of unique users and tweets as the number of hashtags included in our analysis increases. This effect is due to
the fact that many tweets are annotated with multiple hashtags, and many users utilize several different hashtags over the course
of the study period. As a result, the inclusion of a single hashtag may result in the inclusion of many tweets and users also
redundantly associated with other hashtags.
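The diminishing-returns effect can be made concrete with a short sketch that adds hashtags in order of decreasing popularity and records how many unique users have been seen so far. The input layout (a mapping from hashtag to the set of users who used it) and the function name are assumptions for illustration only.

```python
def cumulative_coverage(users_by_tag):
    """Count unique users as hashtags are added in order of decreasing popularity."""
    ordered = sorted(users_by_tag, key=lambda tag: len(users_by_tag[tag]), reverse=True)
    seen = set()
    curve = []
    for tag in ordered:
        seen |= users_by_tag[tag]  # users already counted add nothing new
        curve.append((tag, len(seen)))
    return curve
```

When most users of a less popular tag have already been counted under a more popular one, the curve flattens, which is the behavior the figure describes. The same function applies unchanged to sets of tweet ids.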

To further support the claim that sampling based on this set of hashtags produces a representative set of political tweets,
we selected all the tweets in the gardenhose from the study period that included any one of 2,500 hand-selected political keywords
related to the 2010 elections (Ratkiewicz et al. 2011). We considered only the 312,560 tweets in this set containing a
hashtag because we use this characteristic to define public political communication on Twitter. We found that 26.4% of these
tweets are covered by our target set of hashtags. Furthermore, among the ten most popular hashtags not included in our target
set (#2010memories, #2010disappointments, #ff, #p2000, #2010, #business, #uk, #newsjp, #asia,
#sports), only one is explicitly political and its volume accounts for less than 2% of public political communication. This
coverage confirms that we have isolated a substantial and representative sample of political communication on Twitter.



C. Inferring Political Identities from Communication Networks 

In a previous study we used the set of political tweets from the six weeks preceding the 2010 midterm election to build a 
network representing political retweet interactions among Twitter users. In this network an edge runs from a node representing 
user A to a node representing user B if B retweets content originally broadcast by A, indicating that information has propagated
from A to B. This network consists of 23,766 non-isolate nodes among a total of 45,365, with 18,470 nodes in its largest
connected component and 102 nodes in the next-largest component. We describe the construction of an analogous network of
political mentions in Section V.C.
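A minimal sketch of this construction is given below. It assumes the retweets are available as (retweeter, original author) pairs, ignores self-retweets, and treats connectivity as weak (edge direction ignored) when extracting the largest component; these choices, and the function names, are assumptions of the example rather than details stated above.

```python
from collections import defaultdict, deque


def build_retweet_graph(retweets):
    """Build directed edges A -> B for each (retweeter B, original author A) pair."""
    successors = defaultdict(set)
    for retweeter, author in retweets:
        if retweeter != author:  # assumed: self-retweets are ignored
            successors[author].add(retweeter)
    return successors


def largest_component(successors):
    """Return the largest weakly connected component as a set of nodes."""
    neighbors = defaultdict(set)
    for source, targets in successors.items():
        for target in targets:
            neighbors[source].add(target)
            neighbors[target].add(source)
    best, visited = set(), set()
    for start in neighbors:
        if start in visited:
            continue
        # Breadth-first search over the undirected view of the graph.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node] - component:
                component.add(nxt)
                queue.append(nxt)
        visited |= component
        if len(component) > len(best):
            best = component
    return best
```

Users who appear in no retweet edge never enter the graph, which corresponds to the isolate nodes excluded from the counts above.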

TABLE III Partisan composition of retweet cluster communities as determined through manual annotation of 1,000 random users. (See
§III.C.)

Cluster     Left   Right  Undecidable  # Nodes
A (Top)     1.19%  93.4%  5.36%        7,115
B (Bottom)  80.1%  8.71%  11.1%        11,355

Using a combination of network clustering algorithms and manually-annotated data we determined that the network of political
retweets neatly divides the population of users in the largest connected component into two distinct communities (Figure
V.B) (Conover et al. 2011a). In brief, we used Raghavan's label propagation method seeded with node labels determined
by Newman's leading eigenvector modularity maximization method to assign cluster membership to each node (Newman 2006;
Raghavan et al. 2007). The final community assignments are consistent and robust to fluctuations in starting conditions (Conover
et al. 2011a). To determine whether these communities were composed of users from the political left and right, respectively,
we used qualitative content analysis to evaluate the tweets produced by 1,000 random users appearing in the intersection of the
mention and retweet networks (Kolbe 1991; Krippendorff 2004).
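To make the seeded label propagation step concrete, the following sketch implements a generic asynchronous label propagation loop in which each node adopts the most common label among its neighbors. It is an illustration only: the seed labels are taken as given (the leading eigenvector step is not implemented), and the tie rule (keep the current label when it is among the most common), the fixed random seed, and the function name are assumptions of this example, not the configuration used in the study.

```python
import random
from collections import Counter


def label_propagation(neighbors, seed_labels, max_sweeps=100, rng=None):
    """Asynchronous label propagation starting from `seed_labels`.

    `neighbors` maps each node to the set of its (undirected) neighbors;
    `seed_labels` must assign an initial label to every node.
    """
    rng = rng or random.Random(0)
    labels = dict(seed_labels)
    nodes = list(neighbors)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in a fresh random order each sweep
        changed = False
        for node in nodes:
            counts = Counter(labels[n] for n in neighbors[node])
            if not counts:
                continue  # isolated node keeps its seed label
            top = max(counts.values())
            winners = [label for label, c in counts.items() if c == top]
            if labels[node] in winners:
                continue  # assumed tie rule: keep the current label
            labels[node] = rng.choice(winners)
            changed = True
        if not changed:
            break  # no label changed during a full sweep
    return labels
```

On a graph with two dense groups joined by a single edge, a node seeded with the wrong label is pulled back to the label held by most of its neighbors, which is the sense in which the procedure refines an initial partition.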



To establish the reproducibility of these results we had two authors, working independently, determine whether the content of 
a user's tweets express a 'left', 'right' or 'undecidable' political identity according to the coding rubric developed in a previous 
study (Conover et al. 2011a). These annotations were compared against the work of an independent non-author judge, and
using a well-established measure of inter-annotator agreement we report 'nearly perfect' inter-annotator agreement between
author and non-author annotations for the 'left' and 'right' classes (Cohen's Kappa values of .80 and .82, respectively) and 'fair
to moderate' agreement for the 'undecidable' category (Cohen's Kappa value of .42) (Kolbe 1991; Krippendorff 2004). From
these high levels of inter-annotator agreement we conclude that an objective outside party would be able to reproduce our class 
assignments for most users. 
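For readers unfamiliar with the agreement measure, the sketch below computes Cohen's Kappa for two annotators, together with a per-class variant that collapses all other labels into 'not target'. The one-versus-rest treatment is an assumption about how per-class values such as those above could be obtained; the function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("annotations must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently at random
    # with their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def per_class_kappa(labels_a, labels_b, target):
    """Kappa for one class, treating every other label as 'not target'."""
    return cohens_kappa([x == target for x in labels_a],
                        [x == target for x in labels_b])
```

Kappa corrects raw agreement for the agreement expected by chance, so a rarely used category such as 'undecidable' can show low Kappa even when the two annotators agree on most items.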

Based on this content analysis, we determined that the retweet network communities are highly politically homogeneous, 
consisting of 80.1% left- and 93.4% right-leaning users, respectively (Table III) (Conover et al. 2011a). In this study we use
network community membership as a proxy for the political identities of all 18,470 users in the largest connected component
of the retweet network, and hereafter focus on the behavior of these users. Based on the relative proportions of right- and
left-leaning users identified during the qualitative content analysis stage, this mechanism results in correct predictions for 87.3% of
users in the largest connected component of the retweet network (Conover et al. 2011b).

In the following sections we leverage these data to explore, in detail, how users from the political left and right utilize this 
important social media platform for political activity in different ways. 



IV. BEHAVIOR: INDIVIDUAL-LEVEL POLITICAL ACTIVITY 

Before examining structural differences in the social and communication networks of left- and right-leaning Twitter users, we 
first focus on political activity at the individual level. In this section we compare users in the left- and right-leaning communities 
in terms of their relative rates of content production, the amount of attention they allot to political communication, their respective 
rates of political self-identification, and their propensity for sharing information resources in the form of hyperlinks. 

Right-leaning users are substantially more active and politically engaged with this social media platform. Specifically, our 
analysis shows that left-leaning users produce less total political content, allocate proportionally less time to creating political 
content, are less likely to reveal their political ideology in their profile biography, and are less likely to share resources in the 
form of hyperlinks. All of these findings stand in stark contrast to survey data and media reportage of the 2008 online political 
dynamics, and provide evidence in support of the notion that right-leaning voters are becoming more politically engaged online. 



A. Political Communication 

From the perspective of leveraging social media for political organization, the baseline level of activity among a constituency 
is one of the most important characteristics of a population. Figure 3 shows that while left- and right-leaning users produce
approximately the same number of tweets per user, right-leaning individuals actually produce 54% more total political content 
despite comprising fewer users altogether. This trend is the result of divergent priorities among left- and right-leaning users, as 
right-leaning users devote a substantially larger portion of their activity on Twitter to political communication. In fact, right- 
leaning users were almost twice as likely to create political content, with 22% of all tweets produced by right-leaning users 
containing one or more of the political hashtags under study, compared to only 12% for left-leaning users. 
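The proportion of political content for a group of users can be computed along the following lines. This is a minimal sketch: the hashtag regular expression, the case-insensitive matching, and the function name are assumptions made for the example.

```python
import re

HASHTAG = re.compile(r"#(\w+)")


def political_share(tweets, political_tags):
    """Fraction of tweets containing at least one tracked political hashtag."""
    if not tweets:
        return 0.0
    # Normalize tracked tags: lowercase, without the leading pound sign.
    tracked = {tag.lower().lstrip("#") for tag in political_tags}
    political = sum(
        1 for text in tweets
        if tracked & {h.lower() for h in HASHTAG.findall(text)}
    )
    return political / len(tweets)
```

Applying the function separately to all tweets from each community gives per-community proportions that can be compared directly, independent of the number of users in each group.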
FIG. 3 Total number of tweets produced by right- and left-leaning users (left) compared to the total number of political tweets produced
by users in each group. While both groups produce a comparable amount of content in general, right-leaning users produce a much larger 
number of political tweets despite comprising fewer users in total. We observe that users' behavior tends to be broadly distributed, with many 
individuals creating relatively few tweets, while a few individuals produce substantially larger volumes of content. Note, however, that this 
sample includes only users who produced at least one political hashtag, rather than a random sample among all Twitter users, a feature likely 
responsible for the low number of users who produce few total tweets. 



B. Partisan Self-Identification 

In addition to devoting a larger proportion of tweets to political content, right-leaning users are much more likely to use their
140-character profile 'biography' to explicitly self-identify their political alignment. A survey of the biographies of 400 random
users from the set of individuals selected for qualitative content analysis (Section III.C) reveals that 38.7% of right-leaning users
included reference to their political alignment in this valuable space, as compared with only 24.6% of users in the left-leaning 
community. Taken together, this analysis demonstrates that right-leaning users are much more likely to use Twitter as an outlet 
for political communication, and are substantially more inclined to view the Twitter platform as an explicitly political space. 



C. Resource Sharing 

One of the key functions of the Twitter platform is to serve as a medium for sharing information in the form of hyperlinks
to external content (boyd et al. 2008). Given the constraints of the 140-character format, hyperlinking activity is especially
important to the dissemination of detailed political information among members of a constituency. 

With respect to this aspect of online political engagement, too, we see that right-leaning users are more active than those
individuals in the left-leaning community. Among all tweets produced by users in the right-leaning community, 43.4% contained 
a hyperlink, compared with 36.5% of all tweets from left-leaning users. This trend is even more pronounced if we consider only 
resource sharing within the set of political tweets, with left-leaning users including a hyperlink in 50.8% of political tweets, as 
compared to right-leaning users, who include hyperlinks 62.5% of the time. From these observations we conclude that right- 
leaning users are more inclined to treat Twitter as a platform for aggregating and sharing links to web-based resources, an activity 
crucial to the efficient spread of political information on the Twitter platform. 



V. CONNECTIVITY: GLOBAL-LEVEL POLITICAL ACTIVITY 

Next, we turn our attention to structural differences in social interaction and communication networks of left- and right-leaning 
users. 
FIG. 4 Force-directed layout of the follow relationships among politically-active Twitter users. Nodes are colored according to political
identity. Connections to users who did not engage in political communication on Twitter are not included.



A. Follower Network 



We begin with an analysis of the network defined by the follower/followee relationships shared among members of these
two groups (Figure 4). Encoding the fact that a user subscribes to the content produced by another, the follower network is
best understood as describing the social substrate over which information is likely to flow between political actors on Twitter.
Specifically, though not all connections in the follower network encode equally meaningful social relationships (Huberman
et al. 2009), content is broadcast equally along all edges in this network.



We examine the differences in the follower subgraphs induced by considering only connections between users of the same
political affiliation. For the purposes of this analysis, a directed edge is drawn from user A to user B if A is a follower of B. Basic
statistics about these two subgraphs, including average degree, undirected clustering coefficient, and proportion of reciprocal
links, are presented in Table IV. We see that along all dimensions, users in the right-leaning community are much more tightly
interconnected, with a substantially higher average clustering coefficient and greater average degree. Additionally, we observe a
higher proportion of reciprocal links between right-leaning users, indicating the presence of stronger, mutually-affirmed interest
among individuals in this community. Taken together, these factors give the right-leaning community a basic structural advantage
with respect to the challenge of efficiently spreading political information on the Twitter platform.

TABLE IV Follower network statistics for the subgraphs induced by the set of edges among users of the same political affiliation. Reciprocity
is defined as $D_{\leftrightarrow}/D$, where $D_{\leftrightarrow}$ is the number of dyads with an edge in each direction and $D$ is the total number of dyads with at least one edge.
Follower data was only available for a subset of the study population, owing to private or deleted accounts.

Community   Nodes    Edges       Avg. Degree   Clust. Coeff.   Reciprocity
Left        9,941    803,329     80.80         0.134           42.8%
Right       6,426    1,503,417   233.95        0.221           64.8%

FIG. 5 Log-binned in- and out-degree distributions of the internal follower network at left and right, respectively. As a result of considering
only follower relationships among politically-active users we observe strong cutoffs in both distributions that make curve-fitting unreliable.
However, comparing the two distributions it is clear that the right-leaning community has a much greater proportion of users with many
followers (Kolmogorov-Smirnov p < 10^'^), despite comprising fewer users in total. Understood as an information diffusion substrate,
the proliferation of high-profile hubs gives a natural advantage to the right-leaning community.
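The summary statistics reported for the follower subgraphs can be computed directly from a directed edge list. The following is a minimal sketch, not the authors' code: the function name `follower_stats` and the toy edge list are hypothetical, reciprocity follows the dyad-based definition given with Table IV, and average degree is taken to be edges divided by nodes, which appears consistent with the tabulated values. The clustering coefficient is omitted for brevity.

```python
def follower_stats(edges):
    """Summarize a directed network given as (follower, followee) pairs."""
    # Drop self-loops and duplicate edges.
    edge_set = {(a, b) for a, b in edges if a != b}
    nodes = {u for edge in edge_set for u in edge}
    # A dyad is an unordered pair of users joined by at least one edge.
    dyads = {frozenset(edge) for edge in edge_set}
    # Each mutual dyad is seen twice (once per direction), hence the // 2.
    mutual = sum(1 for a, b in edge_set if (b, a) in edge_set) // 2
    return {
        "nodes": len(nodes),
        "edges": len(edge_set),
        "avg_degree": len(edge_set) / len(nodes),  # assumption: edges / nodes
        "reciprocity": mutual / len(dyads),
    }


# Hypothetical toy network: a and b follow each other, a follows c, c follows d.
toy_edges = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
stats = follower_stats(toy_edges)
```

In the toy network there are three dyads, one of which is mutual, so the reciprocity is one third.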

Using the Kolmogorov-Smirnov two-sample test to measure the degree of similarity between the in- and out-degree distributions
of left- and right-leaning users, we find a significant difference between the in-degree distributions of the two groups,
but only a marginal difference between the corresponding out-degree distributions (Figure 5). We interpret this to mean that
a right-leaning user is more likely to have a large audience of followers who may potentially rebroadcast his or her call to
action or piece of political information. For example, left-leaning users are roughly twice as likely as right-leaning users to have
in-degree one, while users associated with the right are almost four times more likely to have in-degree 1000 than users
associated with the left.
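The two-sample Kolmogorov-Smirnov statistic used in this comparison is the largest gap between the two empirical cumulative distribution functions. The sketch below computes only that statistic, not the p-value; the function name and the two degree samples are made up for illustration and are not drawn from the study data.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The supremum of the ECDF gap is attained at one of the data points.
    for x in set(a) | set(b):
        f_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        f_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        d = max(d, abs(f_a - f_b))
    return d


# Hypothetical in-degree samples for two communities.
left_in_degree = [1, 1, 1, 2, 3, 5]
right_in_degree = [1, 2, 8, 40, 200, 1000]
d_stat = ks_statistic(left_in_degree, right_in_degree)
```

A statistic near zero means the two distributions are nearly indistinguishable; a value near one means they barely overlap.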

Additionally, users in the left-leaning community are more likely to be only peripherally connected into the network, as
evidenced by the distribution of the k-core shell indices of users in each community (Figure 6). For a given network, the k-core
is the maximal subgraph whose nodes (as members of the subgraph) have at least degree k, or, in other words, have at least k
neighbors in the k-core itself. The shell index, c, of a node refers to the coreness (k) of the highest-order k-core of which the
node is a member (Barrat et al. 2008).
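The shell index defined above can be computed by repeatedly peeling away the lowest-degree nodes. The sketch below is one standard way to do this; it assumes edges are treated as undirected, which the source does not state for the follower network, and the function name and toy graph are hypothetical.

```python
def shell_indices(edges):
    """Return {node: shell index} for an undirected graph given as edge pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    degree = {u: len(neighbors) for u, neighbors in adj.items()}
    remaining = set(adj)
    shell = {}
    while remaining:
        # The next shell is set by the smallest degree among surviving nodes.
        k = min(degree[u] for u in remaining)
        stack = [u for u in remaining if degree[u] <= k]
        while stack:
            u = stack.pop()
            if u not in remaining:
                continue  # already peeled via an earlier stack entry
            remaining.discard(u)
            shell[u] = k
            # Removing u lowers its neighbors' degrees; they may join this shell.
            for v in adj[u]:
                if v in remaining:
                    degree[v] -= 1
                    if degree[v] <= k:
                        stack.append(v)
    return shell


# Hypothetical toy graph: a triangle (a, b, c) with a two-node tail (d, e).
toy_graph = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "e")]
shells = shell_indices(toy_graph)
```

The tail nodes d and e are peeled first and receive shell index 1, while the triangle survives into the 2-core, illustrating the distinction between peripheral and well-embedded users.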

These observations lead us to conclude that there are substantial structural differences in the fundamental patterns of social
connectivity among politically left- and right-leaning Twitter users, a finding consistent with the seminal work of Adamic &
Glance (Adamic and Glance 2005) on the connectivity patterns of high-profile partisan bloggers. Specifically, the right-leaning
community is much more densely interconnected, with more users tightly integrated into the right-leaning social network. In
contrast, the network of follower/followee relations among left-leaning users exhibits a much more decentralized, loosely-
interconnected structure, with far fewer mutually-affirmed social connections.

FIG. 6 Linearly binned core distribution of the internal follower network. The difference between these two distributions is highly significant
(Kolmogorov-Smirnov p < 10"'').

TABLE V Retweet network statistics for the subgraphs induced by the set of edges among users of the same political affiliation.

Community   Nodes    Edges    Avg. Degree   Clust. Coeff.   Reciprocity
Left        11,353   32,772   2.88          0.032           13.5%
Right       7,115    39,713   5.58          0.045           12.1%




B. Retweet Network 

Next we consider the structure of the network of political retweets in order to understand how information actually spreads on
the social substrate characterized in Section V.A. While each link in the follower network represents a potential pathway along
which information may flow, edges in the retweet network correspond to real information propagation events. Specifically, when
user A rebroadcasts a tweet produced by user B, she explicitly signifies receipt of the content in question, and thus we draw
an edge from user B to user A indicating the direction of information flow. Consequently, the structure of the retweet network
reveals much about how information actually spreads within these two communities. The network is visualized in Figure 7, and basic
statistics describing the networks induced by retweets containing at least one political hashtag between users of the same partisan
affiliation are shown in Table V.

FIG. 7 The network of political retweets, laid out using a force-directed algorithm. Node colors reflect cluster assignments, which correspond
to politically homogeneous communities of left- and right-leaning users with 87% accuracy. (See § III.C.)

In practice, the tightly-interconnected structure of the follower network confers communication advantages to the right-leaning
community of users. Examining the in- and out-degree distributions for these two communities we find that though the power-law
exponents are similar, the difference between them is statistically significant at the 95% level (Figure 8). The faster decay in
the degree distribution of the left-leaning community implies that right-leaning users are rebroadcast by and rebroadcast content
from a larger number of individuals than users on the left. That right-leaning users pay attention to more information sources
compared to left-leaning individuals is indicative of a higher degree of engagement with the Twitter platform itself. Similarly,
an individual wishing to rapidly reach a wide audience has a natural advantage given the structure of the right-leaning retweet
network.
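To illustrate how two power-law exponents might be estimated and compared, the sketch below uses the approximate maximum likelihood estimator for discrete data given by Clauset, Shalizi & Newman, with standard error (alpha - 1)/sqrt(n). Several choices here are assumptions rather than details from the source: the lower cutoff `x_min` is fixed by hand instead of being fitted, the function names are hypothetical, and the z-test on the difference of exponents is a simple stand-in for whatever significance test was actually used.

```python
import math


def powerlaw_mle(degrees, x_min=1):
    """Approximate discrete power-law MLE: returns (alpha, standard error)."""
    tail = [x for x in degrees if x >= x_min]
    n = len(tail)
    if n == 0:
        raise ValueError("no observations at or above x_min")
    # alpha ~= 1 + n / sum(ln(x_i / (x_min - 1/2)))
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in tail)
    alpha = 1.0 + n / log_sum
    stderr = (alpha - 1.0) / math.sqrt(n)
    return alpha, stderr


def exponents_differ(fit_a, fit_b, z_crit=1.96):
    """Assumed two-sided z-test at the 95% level on two (alpha, stderr) fits."""
    (alpha_a, se_a), (alpha_b, se_b) = fit_a, fit_b
    z = (alpha_a - alpha_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(z) > z_crit


# Degenerate toy sample in which every node has degree 1.
alpha_toy, se_toy = powerlaw_mle([1, 1, 1, 1], x_min=1)
```

A larger exponent corresponds to faster decay, that is, fewer users with very high degree.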

FIG. 8 Log-binned in- and out-degree distributions for the left- and right-leaning retweet network communities. Slopes and standard errors
were inferred using the maximum likelihood estimation method described by Clauset, Shalizi & Newman (Clauset et al. 2007). The rapid
decay of the left-leaning degree distribution indicates that right-leaning users are retweeted by and retweet content from a larger number of
users than those on the left.

With respect to the number of users in high-order k-cores, too, we see that the right-leaning community enjoys structural
advantages, with a greater proportion of highly active users connected to other highly active users (Figure 9). This difference
could have consequences for the spread of information through these networks. Work by Kitsak et al. indicates that it is
individuals with high shell index, rather than those who are most central or well connected, who are the most effective spreaders
of information under a simple SIR-based information diffusion model (Kitsak et al. 2010). Users on the right, therefore, are more
likely than those on the left to be wired into the political communication network in such a way that they are able to facilitate
the broad and rapid dissemination of political information.

We also find that a substantially higher proportion of right-leaning users participate in fully-connected subgraphs of size
k, known as k-cliques. This result is especially important in the context of the complex contagion hypothesis, which posits
that repeated exposures to controversial behaviors are essential to the adoption of these behaviors. Work by Romero, Meeder
and Kleinberg focused specifically on online social networks indicates that this effect is particularly pronounced for political
discourse on Twitter (Romero et al. 2011). With fewer users in high-order k-cores, individuals in the left-leaning community
will be less likely to encounter multiple users discussing the same partisan talking points or calls to action, exactly the kind of
contentious content whose propagation is most likely to benefit from repeated exposure.

FIG. 9 Proportion of users with a given k-core shell index (left) and membership in a k-clique (right) for the retweet network.



TABLE VI Mention network statistics for the subgraphs induced by the set of edges among users of the same political affiliation.

Community   Nodes    Edges    Avg. Degree   Clust. Coeff.   Reciprocity
Left        11,353   50,273   4.42          0.053           20.8%
Right       7,115    64,993   9.13          0.078           24.5%



C. Mention Network 



Mentions are most strongly associated with direct, conversational engagement when the target username appears at the
beginning of a tweet, as opposed to appearing in the body text. Among the mentions in our sample, the overwhelming majority
(94.5%) take this form, providing strong evidence that connectivity among and between users in these two groups represents
actual political discourse rather than simply third-person references. In Table VI we report descriptive statistics on the topology
of the left- and right-leaning mention networks, where an edge from A to B is drawn between two users of the same political
affiliation if A mentions B in a tweet containing at least one political hashtag. Though the two networks exhibit very similar
degree distributions, one important distinction is the fact that a greater proportion of mention relationships in the right-leaning
community are reciprocal. Compared to the number of reciprocal mentions observed in degree-preserving reshufflings of the
left- and right-leaning mention networks, the right-leaning community exhibits 7.5 times as many reciprocal mention interactions
as expected by chance alone, compared to 5.6 times as many in the left-leaning community. Reciprocal
interactions suggest the presence of more meaningful social connections, manifest in conversational dialogue, rather than, for
example, unidirectional commentary on the content of another user's tweets (Huberman et al. 2009). Here too, we find that
users on the political right are more engaged with one another on Twitter, indicating that they are likely to benefit from a richer
dialogue and hence more opportunities for frame-making and consensus building with respect to political topics.
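The comparison against degree-preserving reshufflings can be sketched as follows. This is an illustrative implementation under stated assumptions, not the procedure used in the study: the null model is generated by repeatedly swapping the targets of two randomly chosen edges, which keeps every user's in- and out-degree fixed, and the number of swaps, the number of null samples, all function names, and the toy network are hypothetical.

```python
import random
from collections import Counter


def count_reciprocal(edges):
    """Number of dyads joined by an edge in each direction."""
    edge_set = set(edges)
    return sum(1 for a, b in edge_set if (b, a) in edge_set) // 2


def degree_sequences(edges):
    """Out-degree and in-degree of every node, as two Counters."""
    return Counter(a for a, _ in edges), Counter(b for _, b in edges)


def reshuffle(edges, n_swaps, rng):
    """Degree-preserving rewiring: repeatedly swap the targets of two edges."""
    edge_list = list(set(edges))
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i = rng.randrange(len(edge_list))
        j = rng.randrange(len(edge_list))
        (a, b), (c, d) = edge_list[i], edge_list[j]
        # Proposed swap: a->b, c->d becomes a->d, c->b.
        # Skip it if it would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in edge_set or (c, b) in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def reciprocity_ratio(edges, n_null=20, swaps_per_edge=10, seed=0):
    """Observed reciprocal dyads divided by the mean count in reshuffled networks."""
    rng = random.Random(seed)
    edges = list(set(edges))
    null_counts = [
        count_reciprocal(reshuffle(edges, swaps_per_edge * len(edges), rng))
        for _ in range(n_null)
    ]
    mean_null = sum(null_counts) / n_null
    if mean_null == 0:
        return float("inf")  # no reciprocity at all in the null samples
    return count_reciprocal(edges) / mean_null


# Hypothetical toy mention network: a ring in which every mention is returned.
toy = [(i, (i + 1) % 10) for i in range(10)] + [((i + 1) % 10, i) for i in range(10)]
shuffled = reshuffle(toy, 200, random.Random(0))
```

A ratio well above one, as reported for both communities, indicates far more mutual mentioning than the degree sequence alone would produce.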



VI. POLITICAL GEOGRAPHY 

In addition to characterizing differences in behavior and connectivity, we can also examine the geographic distribution of
individuals in these two communities. Here we present a cartogram in which the color of each state has been scaled to correspond
to the degree to which, in that state, the observed number of tweets originating from the left-leaning community exceeds what
we should expect by chance alone.

Because fewer than one percent of Twitter users provide precise geolocation data, we instead rely on the self-declared 'location'
field of each user's profile to enable geographic analysis of data at the scale of this study. As a free-text field, users are able
to enter arbitrary data, and non-location responses such as 'the moon' do appear in the results. Complicating this analysis
further, some users do not report any location data, though we do not observe a partisan bias in terms of non-entries. Despite these
caveats, a large number of users do report actual locations, and using the Yahoo Maps Web Service API^ we are able to make a
best-guess estimate about the state with which a user most strongly identifies.

FIG. 10 Deviation in volume of left-leaning political communication compared to expected baseline. Each state is filled with a color
corresponding to the extent to which the observed number of tweets is above or below what should be expected in the case where each state has
traffic volume proportional to that observed across all Twitter traffic.

Thus, for each state in which we observe $N$ total tweets, and given the relative proportion of tweets originating from left-leaning
users ($P_l$), we can treat the arrival of partisan tweets as a Bernoulli process, and compute the number of tweets we should expect
to see from left-leaning users as $N P_l$. Likewise, we can compute the extent to which the observed number of tweets associated
with left-leaning users ($T_l$) is above or below the expected number, measured in terms of standard deviations, as

$$\frac{T_l - N P_l}{\sqrt{N P_l (1 - P_l)}}.$$

Figure 10 uses color to encode these deviations for each state, with states in which the volume of activity far exceeds what should
be expected by chance shown in deep red, and those in which the observed volume is far below what should be expected by
chance shown in light yellow.
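As a worked illustration of the per-state deviation, the sketch below assumes the standard deviation is that of a binomial count, sqrt(N P (1 - P)), which follows from treating tweets as independent Bernoulli trials. The function name and the example numbers are hypothetical and are not taken from the study.

```python
import math


def left_deviation(n_total, n_left, p_left):
    """Standard deviations by which left-leaning tweets exceed expectation.

    n_total: tweets observed in the state (N)
    n_left:  tweets in the state from left-leaning users
    p_left:  baseline proportion of tweets from left-leaning users
    """
    expected = n_total * p_left
    # Standard deviation of a binomial count with n_total trials.
    std = math.sqrt(n_total * p_left * (1.0 - p_left))
    return (n_left - expected) / std


# Hypothetical state: 10,000 tweets, 5,200 of them left-leaning, 50% baseline.
z_example = left_deviation(10_000, 5_200, 0.5)
```

Here the expected count is 5,000 with a standard deviation of 50, so 5,200 left-leaning tweets lie four standard deviations above expectation; positive values would be drawn toward the red end of the scale and negative values toward yellow.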

Initial inspection of this figure reveals that the geographic distribution of individuals from the left-leaning network community
corresponds strongly to the traditional political geography of the United States. We see that left-leaning individuals feature
prominently on the coasts and in the Northeast, and tend to be underrepresented in the Midwest and Plains states.

Looking more closely, however, we find that there are some places in which the partisan makeup of tweets is quite different
from what might be hypothesized intuitively. For example, Utah, a traditionally conservative state which at the time of this
writing had two Republican senators, exhibits a dramatically higher volume of left-leaning content than should be expected
by chance alone. One possible explanation for this observation could be that individuals in some states with an ideologically
homogeneous population turn to social media as an outlet for political expression. While this is but one possible explanation
among many, and a more rigorous analysis is required to support any definitive claim, this example illustrates the ways in which
novel hypotheses can derive from data-driven analyses of political and sociological phenomena.



VII. CONCLUSION 

In this study we have described a series of techniques and analyses that indicate a shifting landscape with respect to partisan 
asymmetries in online political engagement. We find that, in contrast to what might be expected given the online political 
dynamics of the 2008 campaign, right-leaning Twitter users exhibit greater levels of political activity, tighter social bonds, and a 
communication network topology that facilitates the rapid and broad dissemination of political information. 

In terms of individual behavior, politically right-leaning Twitter users not only produce more political content and devote a
greater proportion of their time to political discourse, but are also more likely to view the Twitter platform as an explicitly political
space and identify their political leanings in their profiles. With respect to social interactions, members of the right-leaning community
exhibit a higher proportion of reciprocal social and mention relationships, are more likely to rebroadcast content from a large
number of sources, and are more likely to be members of high-order retweet network k-cores and k-cliques. Such structural
features are directly associated with the efficient spreading of information and adoption of political behavior. Taken together,
these features are indicative of a highly-active, densely-interconnected constituency of right-leaning users using this important
social media platform to further their political views.

^ http://developer.yahoo.com/maps/rest/V1/geocode.html

This study is characteristic of an emerging mode of inquiry in the political and social sciences, whereby large-scale behavioral
data are aggregated and analyzed to shed quantitative light on questions whose scale was previously considered outside the realm
of tractable analysis (Lazer et al. 2009). Using structural features of a digital communication network one can make high-fidelity
inferences about the political identities of thousands of individuals. Such data provide a deeper understanding of the changing
landscape of American online political activity. Looking forward, techniques such as these are likely to become increasingly
important as the political and social sciences rely in greater measure on large-scale digital trace data describing human opinion
and behavior.